Revamped tuning #130
Conversation
def grid_search(
    method: str,
    charges: torch.Tensor,
I would turn the logic around and keep the `tune_XXX` methods. Also, `grid_search` is a very common name; it is not really clear from the name alone that this will find the optimal parameters for the methods.
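To make that concrete, a `tune_pme`-style entry point could wrap the grid search internally. This is only a sketch under assumed names and signatures (`tune_pme`, `error_fn`, `timing_fn` are hypothetical, not the actual torch-pme API):

from itertools import product

def tune_pme(charges, cell, positions, cutoff, error_fn, timing_fn, accuracy=1e-3):
    """Hypothetical wrapper: scan PME parameters and return the fastest set within accuracy."""
    candidates = {
        "interpolation_nodes": [3, 4, 5],
        "mesh_spacing": [1.0, 0.5, 0.25],
    }
    best_params, best_time = None, float("inf")
    for values in product(*candidates.values()):
        params = dict(zip(candidates, values))
        # error_fn / timing_fn stand in for the error-bound and benchmarking helpers
        if error_fn(cutoff=cutoff, **params) > accuracy:
            continue
        elapsed = timing_fn(charges, cell, positions, cutoff=cutoff, **params)
        if elapsed < best_time:
            best_params, best_time = params, elapsed
    return best_params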
examples/05-autograd-demo.py (outdated)
@@ -515,3 +518,82 @@ def forward(self, positions, cell, charges):
print(f"Evaluation time:\nPytorch: {time_python}ms\nJitted: {time_jit}ms")

# %%
# Other auto-differentiation ideas
IMHO I wouldn't put this example here, even though I think it is good to have it. The tutorial is already 500 lines and with this it gets super long. I'd rather vote for smaller examples tackling one specific task each. Finding solutions is much easier if they are shorter. See also the beloved matplotlib examples.
Why are there three tuning base classes, with `TuningErrorBounds` and `TuningTimings` hardcoded inside `GridSearchBase`? I think a single base class is enough, no?
src/torchpme/utils/tuning/p3m.py (outdated)
CalculatorClass = P3MCalculator
GridSearchParams = {
    "interpolation_nodes": [2, 3, 4, 5],
    "mesh_spacing": 1 / ((np.exp2(np.arange(2, 8)) - 1) / 2),
Shouldn't we give users the option to choose the grid points on which they want to optimize?
This is mainly because the possible grid points were hard-coded before. If we can do this, it would be good. Shall we let the user input a list of their desired `mesh_spacing` values at the beginning?
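A sketch of how this could look (hypothetical signature, not the current API): fall back to the previously hard-coded grid only when the user does not pass their own candidates.

import numpy as np

def default_mesh_spacings():
    # the grid that used to be hard-coded
    return 1 / ((np.exp2(np.arange(2, 8)) - 1) / 2)

def tune_p3m(charges, cell, positions, cutoff, mesh_spacing_candidates=None):
    if mesh_spacing_candidates is None:
        mesh_spacing_candidates = default_mesh_spacings()
    grid_search_params = {
        "interpolation_nodes": [2, 3, 4, 5],
        "mesh_spacing": list(mesh_spacing_candidates),
    }
    # ... run the grid search over grid_search_params as before ...
    return grid_search_params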
)
value = result.sum()
if self._run_backward:
    value.backward(retain_graph=True)
Why do you need to retain the graph here?
positions.requires_grad_(True)
cell.requires_grad_(True)
charges.requires_grad_(True)
execution_time -= time.time()
This looks like a very weird way of storing the result. Why not use a temp variable?
Suggested change:
- execution_time -= time.time()
+ t0 = time.time()
See below.
if self._device is torch.device("cuda"):
    torch.cuda.synchronize()
execution_time += time.time()
Suggested change:
- execution_time += time.time()
+ execution_time += time.time() - t0
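Put together, the pattern these two suggestions point at could look roughly like this (a sketch only; `fn` stands in for the forward/backward call being benchmarked):

import time
import torch

def time_once(fn, device):
    """Time a single call using a temporary start variable."""
    t0 = time.time()
    fn()
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for queued kernels before reading the clock
    return time.time() - t0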
self._charges = charges
self._cell = cell
self._positions = positions
self._dtype = charges.dtype
self._device = charges.device
self._n_repeat = n_repeat
self._n_warmup = n_warmup
self._run_backward = run_backward
Do you really need all of these private properties? Many of them seem to be used only once and are hardcoded. Also, I think user-supplied variables should be stored publicly: if I pass `positions`, I should be able to access them via `self.positions`, not as a private attribute.
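A small sketch of the suggested convention (attribute names only illustrative): user-supplied inputs stay public, derived values may remain private.

class Tuner:
    def __init__(self, charges, cell, positions, n_repeat=4, run_backward=False):
        # user-supplied inputs are public ...
        self.charges = charges
        self.cell = cell
        self.positions = positions
        self.n_repeat = n_repeat
        self.run_backward = run_backward
        # ... while derived, internal-only values can remain private
        self._dtype = charges.dtype
        self._device = charges.device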
Some notes from our meeting
Update docstrings and tests.
Write a documentation/API page explaining the base class and how we do the tuning (reuse the text for updating the paper). In the API references I would add a new section `tuning`. On the `tuning` page I would explain how we do the tuning, then create one page for each calculator and finally one page for the base classes. On the base-class page, explain how you designed these classes and how they work together.
The subpages for each calculator should first display the tuning function and, below it, the classes for the error bounds. In the introduction text of each, display the equation for the error bounds.
src/torchpme/tuning/base.py (outdated)
positions: torch.Tensor,
cutoff: float,
calculator,
params: list[dict],
I think you don't need the exponent; you should be able to extract it from the calculator.
But the `Potential` does not necessarily have the attribute `exponent`, like `CoulombPotential` 🤔
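One possible middle ground, sketched here with a hypothetical helper and an assumed default of 1.0 (whether that default is appropriate for `CoulombPotential` would need to be checked):

def get_exponent(calculator, default_exponent=1.0):
    # read the exponent from the calculator's potential if present, otherwise fall back
    return getattr(calculator.potential, "exponent", default_exponent)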
src/torchpme/tuning/base.py (outdated)
self._dtype = cell.dtype
self._device = cell.device
Should we pass the dtype here as an argument or deduce it from the calculator?
What do you say, @E-Rum?
src/torchpme/tuning/base.py (outdated)
@staticmethod
def _validate_parameters(
This is now very similar to the one we use in the calculators, right?
Maybe we extract and merge both and make them a standalone private function living in utils.
They are still slightly different from each other: the one in the calculators checks `smearing` while the one in tuning checks `exponent`, but it is possible to extract the common part.
Okay, yeah, it might be useful to have some code sharing if possible.
After extracting the standalone function, do we only call it in the tuning functions, or do we still keep calling it during the initialization of the tuner?
Yes, I am happy with the design. I left some initial comments, but we can start making the code ready to go in.
Very good progress. The overall structure looks convincing to me. I have one major question about the importance of the backward pass in the tuning: if you can convince me that it is important, we can keep it; otherwise I suggest timing only the forward pass.
This example lacks explanations. Linking to the error formulas might be useful, plus some more text between the cells explaining the previous cells and the plans for the upcoming code.
assert isinstance(
    potential, Potential
), f"Potential must be an instance of Potential, got {type(potential)}"
I would rather raise a `ValueError` here. Asserts are usually for testing, and if code should check something under all circumstances, one should use a "real" error. See for example
https://stackoverflow.com/questions/40182944/whats-the-difference-between-raise-try-and-assert
I wonder if `TypeError` is better? And what about the assertions below these lines?
torch-pme/src/torchpme/calculators/calculator.py, lines 51 to 58 in 410bf74:
assert self.dtype == self.potential.dtype, (
    f"Potential and Calculator must have the same dtype, got {self.dtype} and "
    f"{self.potential.dtype}"
)
assert self.device == self.potential.device, (
    f"Potential and Calculator must have the same device, got {self.device} and "
    f"{self.potential.device}"
)
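For reference, the assert-free variant being discussed might look like this; it simply mirrors the checks quoted above (whether `TypeError` or `ValueError` fits best is exactly the open question):

if not isinstance(potential, Potential):
    raise TypeError(
        f"Potential must be an instance of Potential, got {type(potential)}"
    )
if self.dtype != self.potential.dtype:
    raise ValueError(
        f"Potential and Calculator must have the same dtype, got {self.dtype} "
        f"and {self.potential.dtype}"
    )
if self.device != self.potential.device:
    raise ValueError(
        f"Potential and Calculator must have the same device, got {self.device} "
        f"and {self.potential.device}"
    )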
src/torchpme/tuning/ewald.py (outdated)
r""" | ||
Find the optimal parameters for :class:`torchpme.EwaldCalculator`. | ||
|
||
The error formulas are given `online |
I think we don't need this anymore here. We give the equations in the documentation.
Would it be nicer to let users know the origin of these equations in the documentation? Otherwise, as a newcomer I might think you guys fabricated them 😂 Or should we move the references to where the equations are?
src/torchpme/tuning/ewald.py (outdated)
Error bounds for :class:`torchpme.calculators.ewald.EwaldCalculator`.

The error formulas are given `online
<https://www2.icp.uni-stuttgart.de/~icp/mediawiki/images/4/4d/Script_Longrange_Interactions.pdf>`_
(currently not available; needs to be updated later). Note the different notation between
the parameters in the reference and ours:

.. math::

    \alpha &= \left( \sqrt{2}\,\mathrm{smearing} \right)^{-1}

    K &= \frac{2 \pi}{\mathrm{lr\_wavelength}}

    r_c &= \mathrm{cutoff}
This docstring seems very similar to the one of the tuning function.
Maybe write something saying that this class implements the error bounds for the real-space and Fourier-space parts of the Ewald summation...
src/torchpme/tuning/tuner.py (outdated)
positions.requires_grad_(True)
cell.requires_grad_(True)
charges.requires_grad_(True)
execution_time -= time.time()
I think we should use `time.monotonic()`. It seems to be better suited for the timings we do here.
See: https://docs.python.org/3/library/time.html#time.monotonic
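A minimal sketch of the timing loop with a monotonic clock (`fn` and `n_repeat` stand in for the tuner's timed call and repeat count; not the actual tuner code):

import time

def average_time(fn, n_repeat=4):
    total = 0.0
    for _ in range(n_repeat):
        t0 = time.monotonic()  # unaffected by system clock adjustments
        fn()
        total += time.monotonic() - t0
    return total / n_repeat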
src/torchpme/tuning/tuner.py (outdated)
if self._run_backward:
    positions.requires_grad_(True)
    cell.requires_grad_(True)
    charges.requires_grad_(True)
Are we sure this is necessary? Is the backward pass really that uncorrelated with the forward pass? Naively I would imagine that if the forward pass takes longer, the same holds for the backward pass, and since this does not include forces, I don't really see the point.
I'm done with a more-than-decent draft of the example. It explains well how the tuning is done, and then shows how to use the autotuner to also optimize the cutoff.
Thanks, I take it from here.
Revert the default parameter
Thanks for this large change @GardevoirX and @ceriottm !
src/torchpme/tuning/pme.py (outdated)
smearing = torch.as_tensor(smearing)
mesh_spacing = torch.as_tensor(mesh_spacing)
cutoff = torch.as_tensor(cutoff)
interpolation_nodes = torch.as_tensor(interpolation_nodes)
Why are you using `as_tensor` instead of `tensor`?
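For context (not answering for the authors): `torch.as_tensor` avoids a copy when the input is already a tensor with matching dtype and device, while `torch.tensor` always copies its input. A quick illustration:

import torch

x = torch.tensor([1.0, 2.0])
print(torch.as_tensor(x) is x)  # True: the same tensor is returned, no copy
print(x.clone() is x)           # False: an explicit copy is a new tensor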
... )

"""
_validate_parameters(charges, cell, positions, exponent)
Okay, makes sense.
This PR introduces two things:
More work still needs to be done, like writing documentation and fixing the pytests and the example, before this PR is ready.
📚 Documentation preview 📚: https://torch-pme--130.org.readthedocs.build/en/130/